AITopics | word sense

Visualizing and Measuring the Geometry of BERT

Emily Reif, Ann Yuan, Martin Wattenberg, Fernanda B. Viegas, Andy Coenen, Adam Pearce, Been Kim

Neural Information Processing Systems, Feb-11-2026, 13:37:30 GMT

Neural Information Processing Systems http://nips.cc/

information, probe, representation, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
Europe > Spain (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

A new kid on the block: Distributional semantics predicts the word-specific tone signatures of monosyllabic words in conversational Taiwan Mandarin

Xiaoyun Jin, Mirjam Ernestus, R. Harald Baayen

arXiv.org Artificial Intelligence, Nov-24-2025

We present a corpus-based investigation of how the pitch contours of monosyllabic words are realized in spontaneous conversational Mandarin, focusing on the effects of words' meanings. We used the generalized additive model to decompose a given observed pitch contour into a set of component pitch contours that are tied to different control variables and semantic predictors. Even when variables such as word duration, gender, speaker identity, tonal context, vowel height, and utterance position are controlled for, the effect of word remains a strong predictor of tonal realization. We present evidence that this effect of word is a semantic effect: word sense is shown to be a better predictor than word, and heterographic homophones are shown to have different pitch contours. The strongest evidence for the importance of semantics is that the pitch contours of individual word tokens can be predicted from their contextualized embeddings with an accuracy that substantially exceeds a permutation baseline. For phonetics, distributional semantics is a new kid on the block. Although our findings challenge standard theories of Mandarin tone, they fit well within the theoretical framework of the Discriminative Lexicon Model.
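The comparison against a permutation baseline mentioned in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the scoring function (mean squared error), the toy values, and the names `mse` and `permutation_baseline` are assumptions made for illustration only. The idea is to score the predictions against the true targets, then score them again after shuffling the targets many times; predictions that carry real information should beat nearly all of the shuffled scores.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_baseline(y_true, y_pred, n_perm=1000, seed=0):
    """Compare the observed error with errors obtained after shuffling targets.

    Returns (observed_error, mean_permuted_error, p_value), where p_value is
    the share of permutations whose error is at most the observed error
    (with a +1 correction so it is never exactly zero).
    """
    rng = random.Random(seed)
    observed = mse(y_true, y_pred)
    shuffled = list(y_true)
    permuted_errors = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between targets and predictions
        permuted_errors.append(mse(shuffled, y_pred))
    at_least_as_good = sum(e <= observed for e in permuted_errors)
    p_value = (at_least_as_good + 1) / (n_perm + 1)
    return observed, sum(permuted_errors) / n_perm, p_value


# Toy data (hypothetical): one pitch value per token and a model's predictions.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.1]
observed, permuted_mean, p_value = permutation_baseline(y_true, y_pred)
```

A small `p_value` here means that very few shuffles of the targets produce an error as low as the observed one.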

contour, machine learning, natural language, (20 more...)

2511.17337

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

From Ghazals to Sonnets: Decoding the Polysemous Expressions of Love Across Languages

Syed Mohammad Sualeh Ali

arXiv.org Artificial Intelligence, Oct-20-2025

This paper delves into the intricate world of Urdu poetry, exploring its thematic depths through a lens of polysemy. By focusing on the nuanced differences between three seemingly synonymous words (pyaar, muhabbat, and ishq), we expose a spectrum of emotions and experiences unique to the Urdu language. This study employs a polysemic case study approach, meticulously examining how these words are interwoven within the rich tapestry of Urdu poetry. By analyzing their usage and context, we uncover a hidden layer of meaning, revealing subtle distinctions which lack direct equivalents in English literature. Furthermore, we embark on a comparative analysis, generating word embeddings for both Urdu and English terms related to love. This enables us to quantify and visualize the semantic space occupied by these words, providing valuable insights into the cultural and linguistic nuances of expressing love. Through this multifaceted approach, our study sheds light on the captivating complexities of Urdu poetry, offering a deeper understanding and appreciation for its unique portrayal of love and its myriad expressions.

artificial intelligence, natural language, word sense, (18 more...)

2510.15569

Country: Asia (0.14)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.72)

Visualizing and Measuring the Geometry of BERT

Emily Reif, Ann Yuan, Martin Wattenberg, Fernanda B. Viegas, Andy Coenen, Adam Pearce, Been Kim

Neural Information Processing Systems, Oct-2-2025, 05:11:15 GMT

Transformer architectures show significant promise for natural language processing.

artificial intelligence, machine learning, natural language, (18 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Towards Universal Semantics With Large Language Models

Raymond Baartmans, Matthew Raffel, Rahul Vikram, Aiden Deringer, Lizhong Chen

arXiv.org Artificial Intelligence, Jul-8-2025

The Natural Semantic Metalanguage (NSM) is a linguistic theory based on a universal set of semantic primes: simple, primitive word-meanings that have been shown to exist in most, if not all, languages of the world. According to this framework, any word, regardless of complexity, can be paraphrased using these primes, revealing a clear and universally translatable meaning. These paraphrases, known as explications, can offer valuable applications for many natural language processing (NLP) tasks, but producing them has traditionally been a slow, manual process. In this work, we present the first study of using large language models (LLMs) to generate NSM explications. We introduce automatic evaluation methods, a tailored dataset for training and evaluation, and fine-tuned models for this task. Our 1B and 8B models outperform GPT-4o in producing accurate, cross-translatable explications, marking a significant step toward universal semantic representation with LLMs and opening up new possibilities for applications in semantic analysis, translation, and beyond. Our code is available at https://github.com/OSU-STARLAB/DeepNSM.

large language model, machine learning, natural language, (22 more...)

2505.11764

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Solving Word-Sense Disambiguation and Word-Sense Induction with Dictionary Examples

Tadej Škvorc, Marko Robnik-Šikonja

arXiv.org Artificial Intelligence, Mar-6-2025

Many less-resourced languages struggle with a lack of large, task-specific datasets that are required for solving relevant tasks with modern transformer-based large language models (LLMs). On the other hand, many linguistic resources, such as dictionaries, are rarely used in this context despite their large information contents. We show how LLMs can be used to extend existing language resources in less-resourced languages for two important tasks: word-sense disambiguation (WSD) and word-sense induction (WSI). We approach the two tasks through the related but much more accessible word-in-context (WiC) task where, given a pair of sentences and a target word, a classification model is tasked with predicting whether the sense of a given word differs between sentences. We demonstrate that a well-trained model for this task can distinguish between different word senses and can be adapted to solve the WSD and WSI tasks. The advantage of using the WiC task, instead of directly predicting senses, is that the WiC task does not need pre-constructed sense inventories with a sufficient number of examples for each sense, which are rarely available in less-resourced languages. We show that sentence pairs for the WiC task can be successfully generated from dictionary examples using LLMs. The resulting prediction models outperform existing models on WiC, WSD, and WSI tasks. We demonstrate our methodology on the Slovene language, where a monolingual dictionary is available, but word-sense resources are tiny.
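To make the word-in-context (WiC) framing concrete, the sketch below builds labelled sentence pairs from a toy dictionary entry. It only illustrates the pair format: the paper generates pairs with LLMs, which is not reproduced here. The entry, the sense labels, and the function name `wic_pairs` are hypothetical, and the label follows the abstract's convention of marking whether the sense differs between the two sentences.

```python
from itertools import combinations


def wic_pairs(target, senses):
    """Build WiC examples from dictionary usage examples.

    `senses` maps a sense label to a list of example sentences for `target`.
    Each pair of sentences becomes one example; `differs` is True when the
    two sentences come from different senses.
    """
    labelled = [(sense, sent) for sense, sents in senses.items() for sent in sents]
    pairs = []
    for (sense_a, sent_a), (sense_b, sent_b) in combinations(labelled, 2):
        pairs.append({
            "target": target,
            "sentence_1": sent_a,
            "sentence_2": sent_b,
            "differs": sense_a != sense_b,
        })
    return pairs


# Hypothetical dictionary entry with two senses and their usage examples.
entry = {
    "river_side": ["They sat on the bank of the river.",
                   "The bank was covered in reeds."],
    "institution": ["She opened an account at the bank.",
                    "The bank raised its interest rates."],
}
pairs = wic_pairs("bank", entry)
```

A binary classifier trained on such pairs only has to decide whether two usages share a sense, which is why the abstract describes WiC as more accessible than predicting senses directly.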

large language model, machine learning, natural language, (21 more...)

2503.04328

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
Europe > Slovenia > Savinja > Municipality of Celje > Celje (0.04)
Asia (0.04)

Genre:

Overview (0.93)
Research Report > New Finding (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reviews: Visualizing and Measuring the Geometry of BERT

Neural Information Processing Systems, Jan-21-2025, 21:49:46 GMT

Originality: This submission uses existing techniques to analyze how syntax and semantics are represented in BERT. The authors do a good job of contextualizing the work in terms of previous work, for instance similar analyses for other models (like Word2Vec). They also build off of the work of Hewitt and Manning and provide new theoretical justification for Hewitt and Manning's empirical findings.

Quality: Their mathematical arguments are sound, but the authors could add more rigor to the conclusions they draw in the remarks following Theorem 1. The empirical studies show some interesting results.

bert, geometry, visualizing and measuring, (8 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.87)

Word Sense Linking: Disambiguating Outside the Sandbox

Andrei Stefan Bejgu, Edoardo Barba, Luigi Procopio, Alberte Fernández-Castro, Roberto Navigli

arXiv.org Artificial Intelligence, Dec-12-2024

Word Sense Disambiguation (WSD) is the task of associating a word in a given context with its most suitable meaning among a set of possible candidates. While the task has recently witnessed renewed interest, with systems achieving performances above the estimated inter-annotator agreement, at the time of writing it still struggles to find downstream applications. We argue that one of the reasons behind this is the difficulty of applying WSD to plain text. Indeed, in the standard formulation, models work under the assumptions that a) all the spans to disambiguate have already been identified, and b) all the possible candidate senses of each span are provided, both of which are requirements that are far from trivial. In this work, we present a new task called Word Sense Linking (WSL) where, given an input text and a reference sense inventory, systems have to both identify which spans to disambiguate and then link them to their most suitable meaning. We put forward a transformer-based architecture for the task and thoroughly evaluate both its performance and those of state-of-the-art WSD systems scaled to WSL, iteratively relaxing the assumptions of WSD. We hope that our work will foster easier integration of lexical semantics into downstream applications.
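The two steps of Word Sense Linking, span identification followed by linking to a sense inventory, can be sketched with a deliberately simple stand-in. The paper proposes a transformer-based architecture; the code below instead uses greedy longest-match lookup and a gloss-overlap heuristic (in the spirit of the Lesk algorithm) purely for illustration. The inventory, its sense identifiers, and the function name `link_senses` are hypothetical.

```python
def link_senses(text, inventory, max_len=3):
    """Find spans listed in `inventory` and link each to one sense.

    `inventory` maps a lowercase lemma (possibly multiword) to a dict of
    {sense_id: gloss}. Spans are found by greedy longest match; the sense
    whose gloss shares the most words with the rest of the sentence wins.
    """
    tokens = text.lower().replace(".", "").split()
    links, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span in inventory:
                # Context is every token outside the candidate span.
                context = set(tokens[:i] + tokens[i + n:])
                best = max(
                    inventory[span].items(),
                    key=lambda item: len(context & set(item[1].lower().split())),
                )
                links.append((span, best[0]))
                i += n
                break
        else:
            i += 1  # no inventory entry starts at this token
    return links


# Hypothetical two-entry sense inventory.
inventory = {
    "bank": {
        "bank.river": "sloping land beside a river or lake",
        "bank.finance": "institution that accepts deposits and lends money",
    },
    "interest rate": {
        "interest_rate.finance": "percentage charged for borrowing money",
    },
}
links = link_senses("The bank raised the interest rate on deposits.", inventory)
```

Unlike standard WSD, the caller supplies neither the spans nor per-span candidate lists; both are derived from the text and the inventory.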

artificial intelligence, computational linguistic, natural language, (17 more...)

doi: 10.18653/v1/2024.findings-acl.851

2412.0937

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
North America > Dominican Republic (0.04)
(13 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)

A corpus-based investigation of pitch contours of monosyllabic words in conversational Taiwan Mandarin

Xiaoyun Jin, Mirjam Ernestus, R. Harald Baayen

arXiv.org Artificial Intelligence, Sep-12-2024

Chuang et al. (2024) recently reported that the tonal contours of disyllabic Mandarin words with the T2-T4 tone pattern are co-determined by their meanings. Following up on this research, we present a corpus-based investigation of how the pitch contours of monosyllabic words are realized in spontaneous conversational Mandarin, focusing on the effects of contextual predictors on the one hand, and the way in which words' meanings co-determine pitch contours on the other hand. We analyze the F0 contours of 3824 tokens of 63 different word types in a corpus of spontaneous conversational Taiwan Mandarin, using the generalized additive (mixed) model to decompose a given observed pitch contour into a set of component pitch contours. These component pitch contours isolate the contributions to the pitch contour of the variables taken into account in the statistical model. We show that the tones immediately to the left and right of a word substantially modify a word's canonical tone. Once the effect of tonal context is controlled for, the canonical rising (T2) and dipping (T3) tones emerge as low flat tones, contrasting with T1 as a high tone, and with T4 as a high-to-mid falling tone. The neutral tone (T0), which in standard descriptions is taken to primarily depend for its realization on the preceding tone, emerges as a low tone in its own right, the realization of which is modified by the other predictors in the same way as the standard tones T1, T2, T3, and T4. In line with the results from a previous study on disyllabic words with the T2-T4 tonal contour (Chuang et al., 2024), we also show that word, and even more so, word sense, co-determine words' F0 contours, and that, as a consequence, heterographic homophones (e.g., 的, 得, and 地) have their own tonal signatures. Analyses of variable importance using random forests further supported the substantial effect of tonal context and an effect of word sense that is almost as important as that of tonal context.

contour, pitch contour, tone pattern, (15 more...)

2409.07891

Country:

Asia > Taiwan (0.63)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Massachusetts (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)

Coarse-Grained Sense Inventories Based on Semantic Matching between English Dictionaries

Masato Kikuchi, Masatsugu Ono, Toshioki Soga, Tetsu Tanabe, Tadachika Ozono

arXiv.org Artificial Intelligence, Sep-10-2024

WordNet is one of the largest handcrafted concept dictionaries visualizing word connections through semantic relationships. It is widely used as a word sense inventory in natural language processing tasks. However, WordNet's fine-grained senses have been criticized for limiting its usability. In this paper, we semantically match sense definitions from Cambridge dictionaries and WordNet and develop new coarse-grained sense inventories. We verify the effectiveness of our inventories by comparing their semantic coherence with that of the Coarse Sense Inventory. The advantages of the proposed inventories include their low dependency on large-scale resources, better aggregation of closely related senses, CEFR-level assignments, and ease of expansion and improvement.
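Matching sense definitions across two dictionaries can be illustrated with a small sketch. The abstract does not state which matching method the authors use, so the code below substitutes a bag-of-words cosine similarity with a fixed threshold; the definitions, the threshold value, and the names `cosine` and `group_senses` are all assumptions. Fine-grained senses that match the same coarse definition end up in one group.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two texts using raw word counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_senses(fine, coarse, threshold=0.25):
    """Attach each fine-grained sense to its best-matching coarse definition.

    Senses whose best similarity is below `threshold` stay unmatched.
    """
    groups = defaultdict(list)
    for sense_id, gloss in fine.items():
        best_id = max(coarse, key=lambda cid: cosine(gloss, coarse[cid]))
        if cosine(gloss, coarse[best_id]) >= threshold:
            groups[best_id].append(sense_id)
    return dict(groups)


# Hypothetical definitions, not taken from either dictionary.
fine = {
    "bank.n.01": "sloping land beside a body of water",
    "bank.n.02": "a financial institution that accepts deposits",
    "bank.n.03": "a building where a financial institution does business",
}
coarse = {
    "bank_A": "an institution or building for financial business",
    "bank_B": "land along the side of a river or body of water",
}
groups = group_senses(fine, coarse)
```

In this toy case the two finance-related senses collapse into a single coarse sense, which is the kind of aggregation of closely related senses the abstract describes.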

inventory, sense definition, wordnet, (14 more...)

2409.06386

Country:

Asia > Malaysia (0.04)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
